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This paper provides a study and discussion of earlier as well as novel schemes 
for the precise evaluation of finite-temperature response functions of strongly cor- 
related quantum systems in the framework of the time-dependent density matrix 
renormalization group (tDMRG). The computational costs as functions of time and 
temperature are examined at the example of the XXZ Heisenberg chain in the critical 
XY phase and the gapped Neel phase. The matrix product state purifications occur- 
ring in the algorithms are in one-to-one relation with corresponding matrix product 
operators. This notational simplification elucidates implications of quasi-locality on 
the computation costs. Based on the observation that there is considerable freedom 
<—! in designing efficient tDMRG schemes for the calculation of dynamical correlators 

Qh at finite temperatures, a new class of optimizable schemes, as recently suggested 

in arXiv: 1212.3570, is explained and analyzed numerically. A specific novel near- 
optimal scheme that requires no additional optimization reaches maximum times 
^ t that are typically increased by a factor of two, when compared against earlier ap- 

i— i proaches. These increased reachable times make many more physical applications 

t accessible. For each of the described tDMRG schemes, one can devise a correspond- 

^ ing transfer matrix renormalization group (TMRG) variant. 
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I. INTRODUCTION 

This paper addresses the efficient evaluation of finite-temperature response functions 

XAb(Pi *) := IT Tr(e- p6 B(t)A) with B(t) = e i6t Be~ i6t (1) 

Z{3 

for quantum many-particle systems on one-dimensional (ID) lattices in the framework of 
the density matrix renormalization group (DMRG) [1-3]. From the theoretical perspective, 
such quantities occur for example in the context of linear response theory and characterize 
the effect of a perturbation A of the system at time zero on the expectation value of an 
observable B at time t. Initially, the system is in thermal equilibrium, where /3 = 1/T is 
the inverse temperature and Zp = Tre _/3if . The units are chosen such that Boltzmann's 
and Planck's constants are = 1 and h = 1. Such response functions contain important 
information on the many-body physics and are addressed in many experimental setups. For 
example, recent advances in neutron-scattering techniques make very precise measurements 
possible. See for example Refs. [4-7]. It is hence very important to have numerical tools 
at hand that allow for an efficient and precise evaluation of thermal response functions 
for (strongly-correlated) condensed matter models in order to match theoretical models to 
actual materials and to gain an understanding of the underlying physical processes. 

As discussed and demonstrated in several works [8-10], finite-temperature response func- 
tions for strongly-correlated ID systems can be evaluated up to some maximum reachable 
time by using the time-dependent density matrix renormalization group (tDMRG) [11-13] 
applied to a purification [14-16] of the density matrix. In the DMRG approach, those pu- 
rifications are approximated with a controllable precision by matrix product states (MPS) 
[17-19]. To address finite-temperature states with tDMRG was suggested in Refs. [20, 21] 
with first applications in Refs. [22, 23]. A difficulty in the simulations of time-evolved states 
is the growth of entanglement with time [24-26]. In tDMRG calculations, this leads to a cor- 
responding, typically exponential, increase of the computation cost with time and a strong 
limitation of the maximum reachable times, depending on the available computational re- 
sources and the desired accuracy of the simulation. The effect is much more drastic for 
mixed states. The focus of this paper is hence to study the computation cost of the different 
tDMRG schemes for the evaluation of thermal response functions and to suggest novel more 
efficient approaches that allow for a substantial increase in the maximum reachable times. 

The results of such simulations can be Fourier transformed to study the spectral properties 
of the response. In order to avoid ringing artifacts from the Fourier transformation of the 
data on a restricted time interval, one can either use filters, which result in an artificial 
broadening, or use linear prediction [27, 28]. This was employed first in Ref. [29] for T = 
and in Ref. [8] for T > 0. See also Refs. [30, 31], for further applications of linear prediction 
in DMRG calculations at T = 0. 

Earlier alternative DMRG approaches for finite-temperature response functions have built 
on the transfer matrix renormalization group (TMRG) [32-35]. In a first step, analytic con- 
tinuation techniques as in Quantum Monte Carlo (QMC) were employed [36]. In a second 
development, transfer matrices comprising the imaginary- and the real-time evolution were 
used to access autocorrelation functions [37, 38]. In finite-temperature QMC calculations 
(e.g. positive-definite path integral [39, 40] or stochastic series expansion with directed loops 
[41]), real-time or real- frequency response functions have to be extracted by analytic contin- 
uation from imaginary-time results [42] which is ill-conditioned and numerically challenging. 
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Finite temperatures have also been addressed by a hybrid algorithm based on Monte Carlo 
and tDMRG [43, 44]. 

In this paper, I discuss how the simulation based on the evolution of a certain MPS pu- 
rification of the thermal density matrix as described in [8, 10, 20] can actually be understood 
as evolving a matrix product operator (MPO) representation of the square root of the ther- 
mal density matrix. The description in terms of purifications is in this sense superfluous. 
A decisive observation is now that there are considerable degrees of freedom for the choice 
of a specific evaluation scheme. For the two specific schemes, corresponding to Refs. [8, 9] 
(scheme A) and Ref. [10] (scheme £>), respectively, I examine the scaling of the computation 
cost with time. It turns out that scheme A has some advantage at low temperatures and 
scheme B is advantageous at higher temperatures, for which I give an explanation on the 
basis of quasi-locality [45, 46]. The maximum reachable times t^^(/3) and t^^(/3) differ, 
for the same computation cost and accuracy, by a factor of order one. It is then shown that 
an optimized evaluation scheme, for which one optimizes over the aforementioned degrees of 
freedom, yields maximum reachable times that are at least twice as large as for scheme B ) 
specifically Cax( 2 /3) ^ 2t max(/3)- In a11 cases one finds that ^max(2/3) « i max (/3) as t max varies 
slowly as a function of log /3. Due to the typical dependence of the computation cost on the 
degrees of freedom in the evaluation scheme, we can devise a new scheme C that does not 
require any optimization, but outperforms scheme B by a factor of two in the maximum 
reachable time. It is only outdone by scheme A at very low temperatures. Finally, the 
connection between scheme A and an earlier TMRG approach [37] is explained. For each of 
the tDMRG schemes, one can devise a corresponding TMRG variant. 

Due to the substantially increased reachable times, the novel scheme £7, which was intro- 
duced recently in Ref. [47] and is explained here in detail, can be expected to be the method 
of choice for future applications. In Ref. [47], it allowed us to show that the thermal spectral 
functions of ID bosons in the quantum critical regime with dynamic critical exponent z = 2 
follows a universal scaling form and to compute the corresponding scaling function precisely. 



II. MPS PURIFICATION VERSUS MPO DENSITY MATRIX 

As described for example in Refs. [8, 10, 20], the (unnormalized) thermal density matrix 

for a chain of L sites can be obtained in the form of an MPS purification 

\pp) eU®U aux , H aux ~U, (2a) 
\pp)= T A^A^...Al L '<\a 1 a 2 ...a L )^\a[a' 2 ...a' L ) aux , (2b) 

cr^,...,cr£ ^ ^ 

a[,...y L ='\CT) =:|0' / )aux 

where the |a$) label orthonormal site basis states for %, the auxiliary Hilbert space 

is isomorphic to %, and |cr^) aux label orthonormal site basis states for % a ux- The MPS is 

constructed from M^_i x Mi matrices A° z,a % where M = Ml = 1. The matrix sizes {Mi} 
are also called bond dimensions; see for example the review [3]. 
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In order for \pp) to be a purification of pp, it needs to be constructed such that 



Tr aux |^><^| = \cr){aa f \pp){pp\cT ,f cT f ){c 

ctct'cj" 

This can be achieved by choosing the infinite-temperature purification to be 

L 

|1) With |1) := Y k) ® k)aux = ki> ® l^)au: 



\P0 



which can be written as an MPS (2) with bond dimensions Mi = 1. In this so-called ancilla 
approach, finite-temperature purifications can be calculated by imaginary-time evolution 

|p/3) = e-^ /2 ®l aux .|l). 
and response functions are evaluated after a subsequent real-time evolution 

XA6tf*t) = ^[(p^|e^]S[e-^i^>]. (3) 

The evolution of the MPS can for example be implemented by decomposing the propaga- 
tors into circuits of local gates by a Suzuki- Trotter decomposition [11-13] as described in 
Section VI. The square brackets in Eq. (3) indicate which parts of the expression are rep- 
resented as MPS after the evolution. Note that Zp, which is required in Eq. (3), can be 
determined as (pp\pp) = Tre _/3if = Zp. Equivalently, one can work with purifications \pp) 
that are normalized to one. 

In each step of the tDMRG, the evolved states = \ip(/3,t)) are approximated by an 
MPS with bond dimensions Mi = M^(/3, t) that are as small as possible for a given constraint 
on the desired precision of the approximation [3] . This is achieved through truncations. For 
every splitting of the system into a left and a right part, one does a Schmidt decomposition 
1^) = Sfcli ^k\k)L ® \k)R of the state [16] which boils down to doing singular value decom- 
positions of the tensors A4. The corresponding reduced density matrices are X^^I^^^U 
and Xl\k)n(k\R. The bond dimension is then reduced from M to some value M < M by 
retaining only the M largest Schmidt coefficients and truncating all smaller ones. 

M M 
k=l k=l 

The precision is in each step of the algorithm controlled by bounding the truncation weight 

H^trunc -^llV T,k>M X l 



^2k x l 



(5) 



As a matter of fact, the notation of this procedure as an algorithm on purifications is in 
a sense superfluous. Due to the fact that B(H), the space of linear maps on %, and the 
tensor product space %®1-L are isomorphic, everything can be rewritten in terms of MPOs 
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Figure 1: An MPS purification |X) and the corresponding MPO X, according to the isomorphism 
(6). In the graphical representation, boxes represent the tensors A{ that define the MPS or MPO, 
and lines represent partial tensor contractions. The diagram for Tt(YBX) represents the tensor 
network that has to be contracted in order to evaluate the trace of two MPOs X and Y, and a local 
observable B as occurring in the tDMRG schemes for the evaluation of thermal response functions. 
See for example Eq. (9). The computational cost is determined by the bond dimensions of the 
MPOs as in Eq. (10). 



in B(H) instead of MPS purifications in T~L ® % ) 

{ctct'\X) = {<t\X\<t') V w /. (6) 
Examples for this one-to-one relation are 

|1) i — > 1, 

Y®Z aux \X) <— ► YXZ T , 

where the transposition is to be executed in the |cr) basis. In particular, the counterpart of 
the MPS purification (2) is the MPO 



5>i 



A" 2 2 ' U2 ---A a L L ' aL \(T)((T'\ 



(7) 



See for example Refs. [21, 48, 49] for discussions of MPOs. 

Applying evolution operators e~ Af3H or e ±lAtH to the left or right of an MPO X can be 
done by tDMRG in exactly the same way as for MPS, for example, by a Suzuki- Trotter 
decomposition as described in Section VI. The truncation weight, which is kept below a 
certain bound in order to control the precision, is then given by 



trunc 



XI 



\x\ 



(8) 
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Figure 2: Evolution schemes A and i? for the evaluation of the response function according to 
equations (9) and (12). In the DMRG simulations, the operators in square brackets are approxi- 
mated as MPOs. The computation costs (10) for given t and (3 depend on the chosen scheme and 
the desired accuracy (8) of the MPO approximations. 



where ||X|| 2 X^X is the Schatten 2-norm. This is the natural norm within the 

DMRG framework. 



III. EVALUATION SCHEMES A AND B 

Evaluation scheme A as described and employed in Refs. [8, 9] corresponds to Eq. (3). 
In the MPO notation it reads 



XUM = ^Tr ([e-^e^]B[e-^Ae-^]) . 



(9) 



The square brackets indicate which parts of the expression are represented as MPOs after 
the evolution. Figure 2 shows the algorithm diagrammatically. The partition function Zp 
can be obtained from the Schatten 2-norm Zp = (||[e-" H/2 ]|| 2 ) 2 . In the DMRG method, 

a- a' 

multiplications of the matrices A/' i and singular value decompositions, required for the 
truncations of the MPOs, dominate the computation costs. Hence, the computation costs 
for a single evolution step for the first MPO (bond dimensions Mi) and the second MPO 
(bond dimensions M/), and for the final evaluation of Eq. (9) scale as 



L 



Y,Ml ^(Mtf, and ^ max (Mi(M-) 2 , M?M-) , 



(10) 



1=1 



1=1 



1 = 1 



respectively; see Figure 1. 

For a nonsingular operator T e B{1~L), the value of (9) is invariant under transformations 



[ e -fiH/2 e iHt] _^ [f e -f)H/2 e iHt] (Ua) 

[e-^Ae-M 2 ] — ► [e-^Ae-^f- 1 ] (lib) 
of the involved MPOs [62] . This is a considerable degree of freedom that can be exploited to 
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Figure 3: Zero-temperature phase diagram of the spin- 1/2 XXZ Heisenberg model in a magnetic 
field (13). The ground state is fully polarized in the ferromagnetic phase. In the Neel phase, it has 
a finite staggered magnetization. The system is critical (vanishing energy for excitations) in the 
XY, or spin-liquid, phase (gray). The phase boundaries can be obtained from the Bethe ansatz 
[51, 52]. In this work, the properties of different tDMRG schemes for the evaluation of thermal 
response functions are exemplified with this model at the three points J z = 0, 1, 3 with h = 0. 



reduce, for given f3 and i, the bond dimensions Mi of the MPOs and, hence, to reduce the 
computation cost. In cases where such a reduction is possible, one can extend the simulation 
to longer times t. Some numerical experiments showed that the corresponding computation 
cost for the optimization of T scales exponentially with the system size. So a search for 
globally optimal T seems to be a hopeless exercise. It is hence reasonable to constrain 
ourselves to certain classes of transformations, e.g., T = e -$' H e - %m ' and to optimize with 
respect to the degrees of freedom of the class - /3' and t' in that case. See Section V. 

The modified evaluation scheme B (see Fig. 2) employed in Refs. [10, 50] corresponds to 
the choice T = e~ lHt and reads in the operator notation 

= _L Tr ( [e-^' 2 ] B [e~ iAt Ae-^/ 2 e iAt ] ) . (12) 



IV. COSTS OF SCHEMES A AND B AND THEIR EXPLANATION 



In order to study the computational costs of evolution schemes A and B [Eqs. (9) and 
(12)], as functions of time and temperature, I choose as an example the spin-1/2 XXZ 
Heisenberg model 

L-l L 

H = Y,(Sf§»i + SfSf +1 + J z StSt +1 ) -h^St- (13) 

1=1 1=1 

The phase diagram [51-53], as derived by the Bethe ansatz, is shown in Figure 3. The 
simulations are carried out for vanishing magnetic field h — 0, system size L = 128, and 
coupling constants J z = and J z — 1, where the model is gapless, and at J z = 3, where 
the model is in its gapped antiferromagnetic phase. For the time-evolution, a fourth order 
Suzuki- Trotter decomposition with step sizes A/3 = At = 1/8 is used. The truncation 
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Figure 4: Computation cost per time step (%2i(Mi(f3,t)) 3 ) for scheme A (top) and scheme B 
(bottom) with J z = 0, 1, and 3, W — A — 5^ 2 ? an d truncation weights per time step e# = 10 -12 , 
et = 10 -10 . The contour lines correspond to maximum reachable times for different computational 
resources (per time step) and the predefined precision of the simulations. 
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Figure 5: Maximum bond dimensions max^M^(/3,t) for scheme A (top) and scheme B (bottom) 
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weights in the imaginary-time and real-time evolutions were fixed to values ep and e t which 
are specified in the captions of the corresponding figures. 

Figure 4 shows the computation cost per time step for the operators A = W = S^ 2 in 
Eqs. (9) and (12), i.e., spin-flips in the middle of the system. The density plots show, as a 
function of time and temperature, the number of operations needed for a single time step, 
as measured by J2i(Mi(f3,t)) 3 . For scheme A, the cost is dominated by the cost for the 
computation of the MPO 

[e- iflt Ae-^l 2 ]. (14) 
For scheme £>, the computation of the MPO 

[e- iAt Ae-^ 2 e iAt ] (15) 

dominates the numerical costs. Hence, the bond dimensions M^(/5, t) of those operators were 
used in the diagrams. Please notice the logarithmic scale for /3 = 1/T and the cost, i.e., 
the color coding and the contour lines. The contour lines correspond to maximum reach- 
able times for certain computational resources and a predefined precision of the simulation, 
determined by ep and e t . Figure 5 shows for the same simulations, the maximum bond 
dimensions max^i^M^/?, t). 

First of all, the results show that the computation costs increase, for fixed temperature, 
exponentially with the time. This corresponds to a linear increase of the entanglement 
entropy in the evolution of pure states after a quench [24, 25]. For the noncritical case, 
J z = 3, the costs per time step become /3-independent for temperatures that are sufficiently 
below the excitation gap. At higher temperatures, the cost for scheme B is systematically 
smaller than that for scheme A. The effect is strongest for the non-interacting system at 
J z = 0. For J z = 1 and J z = 3, the increase in the maximum reachable times is more 
moderate with a factor of < 1.4. At lower temperatures, scheme A is more efficient than 
scheme B. This trend increases when the accuracy of the simulation is reduced (higher 
truncation weights ep and e t ), as documented in the left part of Figure 6. Figure 5 shows 
that the maximum bond dimensions max^M^(/3, i) evolve almost in the same way as the 
computation costs. For the noncritical case J z = 3, one sees however that the maximum 
bond dimensions occurring in scheme A are for all temperatures smaller than those occurring 
in scheme B ) whereas the computation cost for scheme B is lower at higher temperatures. 
Also, for J z — 1, the maximum bond dimensions occurring in the two schemes are quite 
similar. 

These findings can be explained as follows. In Ref. [10] it was pointed out that the 
operator (15) of scheme B is time-independent for the simple case A = 1, whereas the 
computation cost for the MPO (14), occurring in scheme A ) can increase with time, even in 
this trivial case. More generally, the following argument applies for all operators A with finite 
spatial support supp(A). Typical condensed matter systems are quasi-local [45, 46], i.e., the 
spatial support of operators like e~ %m Ae lHt ^ occurring in scheme £>, grows only linearly 
with time. More precisely, outside a certain space-time cone originating from supp(A), the 
evolved operator acts almost like the identity and does hence not change the entanglement 
in that region; see Figure 7. In mathematical terms, quasi-locality means 



e ~iHt^ e iHt _ e ~iH v t j^ e iH v t 



< c||i|| e ^- dist K supp(i) ), (16) 
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Figure 6: Computation cost per time step (Mi(/3, t)) 3 ) for schemes A and B. Left: J z = 1, and 



W = A = 5^/2 w ^h (increased) truncation weights 
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Figure 7: Evolution of the MPOs (14) and (15), as occurring in schemes A and i?, respectively. 
The plots show the bond dimensions Mi = Mi(f3, t) for A — S~£, 2 , high as well as low temperatures, 
and the critical point J z = 1 as well as the gapped J z = 3. The truncation weights were e# = 10 -12 
and et = 10 -10 . Unlike in scheme A, the MPO for scheme B remains unchanged outside a certain 
space-time cone also at high temperatures. This is due to the quasi-locality (16); (right). 
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where all terms hi in the Hamiltonian H = hi are required to be short-ranged and norm- 
bounded, Hy = ^2i e v is the Hamiltonian truncated to a vicinity V of the spatial support 
of the operator A (supp(^4) C V), dV is the boundary of V and v oc sup^ \\hi\\ is the Lieb- 
Robinson velocity [45, 46]. So the norm-distance between the exactly evolved operator and 
the operator evolved on the subsystem V only, decays exponentially with increasing distance 
of dV to a space-time cone defined by the Lieb- Robinson velocity v : and originating from 
supp(A) at time zero. As the evolved operator e~ lHt Ae lHt : hence, behaves like the identity 
outside the specified space-time cone, it does not alter the operator [e -/3if / 2 ] in that region - 
in particular, not the MPO bond dimensions Mi - for the product (15) occurring in scheme 
B. This effect is very well visible in the plots for J z — 1 and f3 = 3 shown in Figure 7. For 
both tDMRG schemes, the maximum bond dimensions occur in the middle of the system, 
where A = S^ 2 acts, and grow (exponentially) with time t. The maxima have comparable 
values. Whereas, in scheme B, Mi remains unchanged outside the Lieb-Robinson space-time 
cone due to the quasi-locality, they grow for all bonds in scheme A. This implies according 
to Eq. (10) an increased cost for scheme A at those temperatures. 

Nevertheless, as Figures 4-7 exemplify, scheme A is often advantageous at very low 
temperatures, especially for noncritical systems. For low temperatures, the quasi-locality is 
not essential, as e~^ H l 2 limits the effect of the real-time propagators e ±lHt to a low-energy 
subspace. Sufficiently far away from the support of A, the resulting dynamics is then very 
restricted and can not lead to a big increase of the bond dimensions. This is true for both 
schemes. Furthermore, in the middle of the system, where A = S^ 2 acts, the growth of the 
bond dimensions is slower for scheme A which can be attributed to the fact that one acts 
with one propagator in Eq. (14) instead of acting with two propagators in Eq. (15). As the 
plots for J z = 1, 3 and f3 = 40, 48 in Figure 7 exemplify, the effect is much more pronounced 
for gapped systems. 

The described properties have also been found for other local operators A and B like 
S z L i 2 etc. The behavior for non-local operators like S^ =Q := is also quite similar as 

exemplified in Figure 6 for J z = 0,3. Please note that [S£ =0 ,H] = in the isotropic case 
J z = 1 for which the evolution of S^ =Q (t) is hence trivial. 

V. OPTIMIZED SCHEMES AND NEAR-OPTIMAL SCHEME C 

Schemes A and £>, Eqs. (9) and (12), are typically far from optimal. One has a lot of 
freedom in designing a scheme that is as efficient as possible. The choice for a specific 
measure for the efficiency of a scheme can depend on the available computational resources. 
In this paper, the computation cost per time step as quantified by ^ (M^(/3,i)) 3 [Eq. (10)] 
and, as an alternative, the maximum bond dimension max^M^(/3, i) are considered as useful 
measures. With a nonsingular operator T, the most generic scheme involving two MPOs is 
Z' 1 Tr ( [e iAt BT] [T _1 e"^*i4e"^]). As pointed out in Section III, to optimize efficiently 

with respect to generic T is in general not possible. Thus, it is reasonable to constrain 
ourselves to a certain subclass of schemes and to optimize the efficiency over the remaining 
degrees of freedom of that class. Let us consider the class of schemes 

xf^'iM = ^-Tt ([e^e-^Be"^'] [ e -i6(t-n A e -^ 6 e iA ^] ) . (17) 
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Figure 8: The computation cost per time step J2i (Mi(f3,t)) 3 (top and bottom) and the maximum 
bond dimension max^M^(/3,t) (middle) for the optimized evaluation scheme (18) with J z = 0,1, 
and 3, respectively. The schemes for W — A — S^ 2 , S^ =Q were optimized with respect to f3 f and 
t 1 . The corresponding efficiency measures were chosen to be the computation cost per time step 
for the top and bottom rows of diagrams, and the maximum bond dimension was chosen for the 
middle row of diagrams. In all computations, the truncation weights were chosen as e# = 10 -12 , 

<, io L0 . 

As before, the two factors in square brackets are to be approximated by MPOs [62] . For given 
inverse temperature /?, time i, and accuracy as quantified by the truncation weight e [Eq. (8)] 
one can now optimize the chosen efficiency measure with respect to i', and t" . Scheme 
A corresponds to the choice (/?', £',£") = (/3/2,i,0), and scheme B to (/3 7 , *',*") = (/5/2,0,0). 
It is unpractical to explore the full three-dimensional parameter space, and we constrain 
ourselves in the following to the subspace with t' — t" , i.e., to the schemes 

xZ(M = ±-Tr ([e-^B(t')] [A{1? - t)e~W]) 

= A- Tr ( [e iAt 'e-P' 6 Be- i6t '] [ e -i6{t-^ e -(p-p')6 e i6(t-^ j (lg) 
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Figure 9: As this comparison for J z — 1 and W — A — 5^/2 snows ( an time axes have the same 
scales), the maximum reachable times in the optimized scheme exceed those of the earlier schemes 
roughly by a factor of two. The efficiency measure for the optimization was chosen to be the 
computation cost per time step (left) and the maximum bond dimension (right), respectively. The 
truncation weights were chosen as e# = 10 -12 , e t = 10 -10 . 



2^ax(f)<^ax(/3-/3') + ^ax(/3') 
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[ e -^H/2] 
/ \ 



[ e -iHt A A e -(3H/2 e iHt A ^ 
^ e iHt B e -(3H/2 g e -iHt B ] 

y 

Tr ([e^e-^/ 2 Be-^][e-^ie-^/ 2 e^]) 



Figure 10: Left: Two possible scenarios for the maximum times t£? a x [Eq. (19)] reachable with the 
tDMRG scheme (18) for the case A = fit and given computational resources. If t^ Lax (p / ) is concave 
on the interval [0,/3], the optimal scheme corresponds to f3 f = /3/2, i.e., scheme C. In the examples 
studied here, t^ iax (/3 / ) was found to be almost concave in the relevant temperature ranges. Right: 
Evaluation scheme C for the evaluation of the response function according to Eq. (20). 
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Figure 8 shows, with W = A, the computation cost per time step and the maximum bond 
dimensions for the optimized scheme. The maximum reachable times t^ x of the optimized 
scheme are at least twice as large as the reachable times ^axC^) f° r scheme B, specifically 
^max(2/5) ^ 2£^ ax (/3). See also Figure 9. The factor two in the temperature argument is 
inessential, as one finds that £ m ax(2/3) ~ t max (/3). This is due to the fact that £ max varies 
slowly as a function of log/3 for all response functions considered in this paper. Let us 
discuss the possible scenarios for the optimum values of ft and t f for the case W = A: The 
computation cost for scheme B is dominated by the tDMRG computation of the operator 
^ e -iHtA e -p'H/2 e iHt^ - n (12). The corresponding maximum time reachable for some given 
computational resources, truncation weight e, and inverse temperature /3 f shall be denoted 
by imaxO^')- The maximum reachable time for the scheme (18) as a function of f3 is then 
given by 

Ca X (/5) = maxcK^ (f*^) + tj^tf - /?')) - (19) 

There are two possible scenarios, as depicted in Figure 10. If imax(/^) 18 concave on the 
interval [0,/3], the optimal scheme corresponds to /3' = /3/2. Otherwise, there will be an 
optimum f3 f ^ (3/2. In the examples studied here, imax(/^) was f° un d to be almost concave 
in the relevant temperature ranges. 

One can hence devise a corresponding scheme C that is near-optimal for W = A, does 
not require any optimization, but outperforms scheme B by a factor of two in the maximum 
reachable times. It is only outdone by scheme A at very low temperatures. The scheme 
corresponds to the choice /?' = /3/2, i.e., (/?',£',£") = (P/2,t B ,t B ) in Eq. (17) 

X%WM + t B ) = J-Tr {y^B{t B )\ [A(-t A )e-^]) 

= ^ Tr ([e^e-^ 2 Be-^] [e^ie^/V^]) . (20) 

After the imaginary time-evolution that yields [e~2^] 5 one runs two real-time tDMRG sim- 
ulations to obtain MPOs [e i6tB e~2 6 Be~ i6tB ] and [e~ i6tA Ae~2 6 e i6tA ]. With Eq. (20) one 
then obtains X^(/3, tA+t B )] see Figure 10. The accuracy of the MPOs should be kept under 
control during the whole simulation, for example, as described in Section II by bounding 
the truncation error [62] . If this is done properly, it is of minor importance what specific t a 
and t B — t — tA are chosen to evaluate XabW^) f° r a gi ven time t. For the case A = B\ 
the maximum reachable times for tA and t B are equal, and the total maximum reachable 
time with this scheme is then twice as large as the maximum time of scheme B. This makes 
many more physical applications accessible. 

VI. tDMRG VERSUS TMRG CONTRACTIONS 

The time-evolution of matrix product states or density operators, as employed in the pre- 
sented evaluation schemes for thermal response functions, can be implemented within the 
tDMRG framework in several different ways. The results for the computation costs of the 
different evaluation schemes are essentially independent of the chosen time-evolution algo- 
rithm. The currently most common choice [11 13] employs a Suzuki- Trotter decomposition 
[54-58]. Alternatively, one can use for example the Arnoldi method, Runge-Kutta method 
or other Krylov subspace approaches [59-61]. 
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Let us consider in the following a Hamiltonian H = J2f=i hi with nearest-neighbor inter- 
actions hi, operators A and B with finite spatial support (for simplicity single sites), and 
implementing the time-evolution with a Suzuki- Trotter decomposition. This case allows for 
an alternative evaluation of the thermal response functions on the basis of the transfer ma- 
trix renormalization group (TMRG) [32-35]. The Suzuki- Trotter decompositions of real- or 
imaginary-time propagators read 

\ N 

= , e -a m+1 Ar#! J-J ^Arftg-anAaffij + q ((At)^ +1 ) (21) 



where r = /3 or r = ±z£, respectively, At = r/N is the time step, and contain the 
Hamiltonian terms on even and odd bonds, respectively. The coefficients a n and b n sum 
up to one (^ n a n ,^ n b n = 1) an d define different decompositions of order p and number 
of stages m per time step. A common choice is the leapfrog algorithm which has m = 1 
stages and order p = 2 with ai = a 2 = 1/2 and 6i = 1. In this work, a decomposition 
with m = 5 stages and order p = 4 was used. When the operator exponentials occurring 
in the formula Z^ 1 Tr(e~^ H ^ 2 e lHt Be~ tHt Ae~^ H ^ 2 ) for the response function in scheme A, 
Eq. (9), are decomposed by such a Suzuki- Trotter decomposition, one ends up with the task 
of contracting a 2D tensor network of local gates as displayed in Figure 11 with the space 
direction being horizontal and the time direction being vertical. 

The tDMRG approach scheme A corresponds to starting with a trivial MPO [1] that 
represents the identity with bond dimensions Mi = 1 V^ji^]- Beginning at the bottom, 
one then contracts one row of local gates after another to that MPO until reaching the 
row that contains operator Z?, yielding [e~ lHt Ae~^ H / 2 \ . Multiplying, in one step of this 
iteration, a single row of gates with the current MPO in an exact manner, gives a resulting 
MPO with increased bond dimensions. A subsequent truncation step reduces the bond 
dimensions again. The degree to which the bond dimensions are reduced in the truncation 
steps is controlled by the predefined truncation weight (8) which determines the accuracy 
of the computation. Similarly, starting from the top, one contracts rows of gates to obtain 
[e~P H / 2 e lHt \. Finally, one computes the response function by evaluating Eq. (9) as visualized 
in Figure 1. 

In the TMRG approach, the columns of local gates in Figure 11 are interpreted as trans- 
fer matrices T = T(/3, t) (also and Tg) and operate on a Hilbert space of dimension 
^/3/A0+2t/At^ w h ere d denotes the dimension of the single-site Hilbert space. The last column 
corresponds to a pure state |?/^) and the first column to a dual state (^ |. Those transfer 
matrices and states have matrix product representations with bond dimensions Mi < d 2 V^. 
Starting from the left and the right with the states ip or ip L , respectively, one transfer matrix 
after another is applied with intermediate truncation steps, for example, until one reaches 
the column containing operator B. The response function is then given by the overlap of 
the two resulting MPS, [(ip Q \f • • • ff A f • • • f\[f 6 f • • • f\ip L )]. 

The advantages of the tDMRG approaches, pursued in this work, are that, (i), the number 
of matrices in the MPOs is fixed by the lattice size L instead of growing linearly with the 
time t and inverse temperature /?, (ii), the operators A and B can have non-local support, 
(iii), one can evaluate the spectral function for all times £, up to the maximum reachable 
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Figure 11: Visualization of the 2D tensor network that has to be contracted for the evaluation of the 
thermal response function Z^ 1 Tx{e~ m e im Be~ iHt A) when using a Suzuki- Trotter decomposition 
of the operator exponentials. This can be done either using tDMRG or using TMRG. In the 
tDMRG method, one contracts one row of local gates after another (with intermediate truncation 
steps), evolving two MPOs that are oriented horizontally. The final evaluation of the response 
function according to Eq. (9) corresponds to taking the Hilbert-Schmidt scalar product of the two 
MPOs. Alternatively, one can contract this tensor network with TMRG. In this case the meaning 
of states and dual states is changed as indicated on the right side of the figure. One starts on the 
left and on the right with vertically oriented MPS. One column of local gates after another, the 
transfer matrices, is applied to those MPS, again with intermediate truncation steps to reduce the 
bond dimensions. Finally, the expectation value is obtained by evaluating the scalar product of 
the two MPS. 

time, by two tDMRG runs whereas, in the TMRG approach, one has to do a new calculation 
for every time t of interest. 

Concerning (ii), it should be pointed out that non-local operators A and B would also 
be accessible in the TMRG approach as long as they are MPOs with sufficiently small bond 
dimensions. In Refs. [37, 38] the focus was on autocorrelation functions of local operators 
in the thermodynamic limit L oo. In this case, one does not need to contract one 
column after another, but can evaluate the response function directly from the left and right 
transfer matrix eigenvectors with maximum eigenvalue. When one uses the so-called infinite- 
system DMRG to obtain those eigenvectors, problem (iii) does not occur and autocorrelation 
function can be evaluated for all accessible times t in a single run. One should however be 
careful in applying infinite-system DMRG (a single build-up sweep) only, as this algorithm 
has several pitfalls [3]. At least for larger times, finite-system DMRG (multiple sweeps) 
should be necessary. 
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VII. CONCLUSION 

In this paper, I have studied and explained the computation costs of different tDMRG 
schemes for the efficient and precise evaluation of finite-temperature response functions of 
strongly correlated quantum systems. Simplifying the notation to some extent by formu- 
lating everything in terms of MPOs, elucidated the effects of quasi-locality on the costs. 
The new class of optimizable evaluation schemes typically outperforms the earlier schemes 
from the literature in terms of the maximum reachable times by a factor of > 2. This gives 
access to many more physical applications. The novel near-optimal scheme, scheme C, that 
requires no additional optimization, can be expected to be the method of choice for future 
applications. Finally, it has been argued that the tDMRG schemes are in some respects fa- 
vorable to corresponding TMRG variants. As a first example, the novel scheme C has been 
used in Ref. [47] to calculate, for ID bosons in the quantum critical regime with dynamic 
critical exponent z — 2, the universal scaling function for the thermal spectral function. 

Discussions with J. Barthel, C. Karrasch, A. Kolezhuk, I. P. McCulloch, S. Sachdev, J. 
Sirker, and U. Schollwock are gratefully acknowledged. 
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[62] Please note that the expressions (9), (12), (17), and (20) for the response function in the dif- 
ferent evaluation schemes give the same values, the exact Xab(@i ^) °^ (-0' om y ^ ^he MPO 
representations of the operators in square brackets are exact - corresponding to exponentially 
big bond dimensions max^M^ = 0{d L ). A guiding principle of the DMRG approach is that 
in typical condensed matter applications such exponentially big Mi are not required. Corre- 
sponding truncations of the MPOs result in slight deviations of the values from the different 
evaluation schemes. It is essential to keep the errors in these MPO truncations controlled and 
small at all times. 



